Conversation
baogorek
left a comment
There was a problem hiding this comment.
Looks good, just please update this one bullet point in the PR body (if this is indeed correct):
Says:
- Remove roth_ira_contributions target ($39B) — structurally $0 due to CPS allocation bug where traditional IRA always exhausts the IRA limit first
But:
code actually keeps roth_ira_contributions and updates it from $39B to $35.0B in both loss.py and etl_national_targets.py (which it can do be
|
@baogorek rewrote the PR fairly substantially since your last review. My first attempt at this so please be as nit-picky as needed. |
Replace hard-coded RETIREMENT_LIMITS dict with a shared utility that reads from policyengine-us's parameter tree at runtime. This ensures limits stay in sync as policyengine-us is updated, and eliminates a maintenance risk when the same dict is duplicated in puf_impute.py (PR #554). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
baogorek
left a comment
There was a problem hiding this comment.
@PavelMakarchuk In full transparency, my review was focused on how disruptive this PR would be to #538, which fortunately is not at all. Looking at the printout of items I got from Claude, it looks like there are some duplication issues that are not that serious.
Point 5 about the subsampling is interesting. I believe this is where @juaristi22 found that 40,000 was where diminishing returns started to kick in. But I'm reading that this is new code, and if you found that it works well, it's not a blocker.
I'm going to go ahead and approve, but please consider these points before you merge:
- Retirement limits duplicated between cps.py and puf_impute.py — _get_retirement_limits() is a copy-paste of the
limits dict in cps.py with the comment "duplicated here to avoid circular imports." These should be extracted to a
shared location — imputation_parameters.yaml already holds the allocation shares and would be a natural home for the
limits too. Two copies will inevitably drift. - _qrf_ss_shares swallows exceptions silently — The except Exception block returns None without logging the
exception or traceback. _impute_retirement_contributions at least logs %s, e, but neither logs exc_info=True. When
these fail in production you'll have no idea why. - TestLimitsMatchCps doesn't actually cross-check against cps.py — Despite the name, this test just re-calls
_get_retirement_limits() with hardcoded values. A real cross-check would import the CPS limits and assert equality —
which would also catch the duplication problem from point 1. - Duplicate conftest.py — Both root conftest.py and policyengine_us_data/tests/test_calibration/conftest.py mock
microimpute identically. Root conftest propagates downward in pytest, so the inner one is redundant. Note: the
calibration test conftest on my branch has different fixtures (db_uri, dataset_path), so we'll have a merge conflict
— easy to resolve by combining both into one file, but worth being aware of. - QRF training subsample of 5000 is undocumented — CPS has ~200K person records. Why 5000? If it's for speed, a
comment explaining the tradeoff would help. If arbitrary, a larger sample would improve predictions. - SE + wages people get $0 in 401(k)/IRA — In the proportional RETCB_VAL allocation, if someone has both SE income
and wages, the entire amount goes to self_employed_pension_contributions first, leaving remaining = 0. Many people
have W-2 jobs plus side gigs. Worth at least a comment acknowledging this as a known limitation. - Validation script hardcodes targets — validation/validate_retirement_imputation.py re-declares all target values
instead of importing NATIONAL_CALIBRATION_TARGETS from loss.py. These will drift.
Worth discussing:
- SE + wages people get $0 in 401(k)/IRA — In the proportional RETCB_VAL allocation, if someone has both SE income
and wages, the entire amount goes to self_employed_pension_contributions first, leaving remaining = 0. Many people
have W-2 jobs plus side gigs. Worth at least a comment acknowledging this as a known limitation. - Validation script hardcodes targets — validation/validate_retirement_imputation.py re-declares all target values
instead of importing NATIONAL_CALIBRATION_TARGETS from loss.py. These will drift.
Looks good:
- _drop_formula_variables is a solid cleanup — my stacked dataset builder already follows this pattern (filtering to
input_variables), so making ExtendedCPS generation consistent is welcome. - Calibration target sourcing is well-documented.
- The conftest try/except ImportError fix for the microimpute mock is the right approach.
* Read retirement limits from policyengine-us parameters Replace hard-coded RETIREMENT_LIMITS dict with a shared utility that reads from policyengine-us's parameter tree at runtime. This ensures limits stay in sync as policyengine-us is updated, and eliminates a maintenance risk when the same dict is duplicated in puf_impute.py (PR #554). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fixup! Read retirement limits from policyengine-us parameters * Revert test formatting to match CI Black version The fixup reformatted SQL strings in test_database_build.py using a newer Black version that disagrees with CI's lgeiger/black-action. Revert to the original formatting that CI accepts. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Add tests for retirement limits utility Verifies get_retirement_limits returns correct IRS contribution limits for 2020, 2023, and 2025 by comparing against known values. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Handle both old and new policyengine-us catch-up parameter layouts The SECURE 2.0 update (merged to pe-us main, not yet released) renames children["401k"] to children["k401"] and changes it from a simple int to an age-bracketed scale. Handle both layouts so the code works with the current release and future releases. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Use k401 parameter directly, bump policyengine-us>=1.572.5 Drop dual-path handling in favor of the current parameter layout (k401 scale with age brackets from SECURE 2.0). Bump the minimum policyengine-us version to 1.572.5 which introduced this parameter. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
|
I've merged #566 which extracts the retirement limits into a shared utility ( After rebasing, this should be good to merge once CI passes. Nice work on the QRF imputation approach. |
|
Update: tried to rebase this for you but the branch is based on a very old commit (217 commits behind main). The cherry-pick/rebase hits conflicts because several files have been restructured since then (puf_impute.py, etl_national_targets.py, cps.py, loss.py, etc.). You'll need to rebase locally — you know the retirement code best so the conflict resolution will be straightforward for you. After rebasing, you can also import from |
- Correct traditional_ira_contributions: $25B → $13.2B (SOI 1304 Table 1.4 actual deduction, not total contributions) - Add traditional_401k_contributions: $567.9B (BEA/FRED employee DC contributions) - Add self_employed_pension_contribution_ald: $29.5B (SOI 1304 Table 1.4 Keogh plan deduction) - Remove roth_ira_contributions: structurally $0 due to CPS allocation bug (#553) - Update both loss.py and etl_national_targets.py Closes #553 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace sequential waterfall with proportional allocation based on administrative data. The old waterfall gave 401(k) first priority, consuming all of RETCB_VAL and leaving IRA contributions at $0 for every record. The Roth IRA allocation was also mathematically guaranteed to produce $0. The new approach splits RETCB_VAL proportionally: - DC vs IRA: 90.8% / 9.2% (BEA/FRED vs IRS SOI) - Within DC: 85% traditional / 15% Roth (Vanguard/PSCA) - Within IRA: 39.2% traditional / 60.8% Roth (IRS SOI Tables 5 & 6) All fractions are stored in imputation_parameters.yaml with sources. Contribution limits are still enforced. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Split 401(k) target: $567.9B total employee DC deferrals (BEA/FRED) into traditional $482.7B (85%) and Roth $85.2B (15%) using Vanguard How America Saves 2024 dollar share estimate - Add roth_ira_contributions target: $35.0B from IRS SOI Accumulation Tables 5 & 6 (TY 2022) — direct administrative source - Update etl_national_targets.py in parallel Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When PUF imputation replaces social_security values, the sub-components (retirement, disability, survivors, dependents) were left unchanged, creating a mismatch. This caused a base-year discontinuity where projected years had ~3x more SS recipients than the base year, producing artificial 9-point Gini swings. The new reconcile_ss_subcomponents() function rescales sub-components proportionally after PUF imputation. New recipients (CPS had zero SS but PUF imputed positive) default to retirement. Fixes #551 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
New PUF recipients (CPS had zero SS) now get assigned based on age: >= 62 -> retirement, < 62 -> disability. Matches the CPS fallback logic. Falls back to retirement if age is unavailable. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Instead of the simple age >= 62 heuristic, train a QRF on CPS records (where the reason-code split is known) to predict shares for new PUF recipients. Uses age, gender, marital status, and other demographics. Falls back to age heuristic when microimpute is unavailable or training data is insufficient (< 100 records). 14 tests covering: - Proportional rescaling (existing recipients) - Age heuristic fallback (4 tests) - QRF share prediction (4 tests including sum-to-one and elderly-predicted-as-retirement) - Edge cases (zero imputation, missing subs, no SS) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
CPS-PUF link is statistical (not identity-based), so the paired CPS record's sub-component split is just one noisy draw. A QRF trained on all CPS SS recipients gives a better expected prediction. Also removes unnecessary try/except ImportError guard for microimpute. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Variables with formulas in policyengine-us are recomputed by the simulation engine, so storing them wastes space and can mislead validation. This removes 9 such variables from the extended CPS output (saving ~15MB). Also adds tests verifying: - No formula variables are stored (except person_id) - Stored input values match what the simulation computes - SS sub-components sum to total social_security per person Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Update _drop_formula_variables and tests to catch variables that use `adds` or `subtracts` (e.g. social_security), not just explicit formulas. These are also recomputed by the simulation engine. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
PUF clones previously copied CPS retirement contributions blindly, so a record with $0 wages could have $50k in 401(k) contributions. Train a QRF on CPS data (which has realistic income-to-contribution relationships) and predict onto the PUF half using PUF-imputed income. Post-prediction constraints enforce contribution caps, zero-out rules for records with no wages/SE income, and non-negativity. - Remove traditional_ira_contributions from IMPUTED_VARIABLES - Add CPS_RETIREMENT_VARIABLES, RETIREMENT_PREDICTORS constants - Add _get_retirement_limits() with year-specific 401k/IRA caps - Add _impute_retirement_contributions() following _impute_weeks_unemployed pattern - Integrate into puf_clone_dataset() variable routing loop - Add 34 unit tests covering constraints, routing, and limits Closes #561 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add income predictors (interest, dividends, pension, SS) to the retirement contribution QRF, matching issue #561's recommendation. Split RETIREMENT_PREDICTORS into demographic and income sublists so the test side correctly sources income from PUF imputations. Also add validation/validate_retirement_imputation.py for post-build constraint checking and aggregate comparison against calibration targets. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Wrap QRF fit/predict in try/except so a crash returns zeros instead of blowing up the entire puf_clone_dataset build. Document the pre_tax_contributions inconsistency (OVERRIDDEN vs CPS-trained sub-components) for future reconciliation. Tests: add test_qrf_failure_returns_zeros and test_training_data_failure_returns_zeros (50 total). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The conftest.py files used `if "microimpute" not in sys.modules` to decide whether to install a MagicMock. On CI (Python 3.13, where microimpute is fully installed), this condition was True at pytest startup because no code had imported microimpute yet. The mock replaced the real package, so tests that triggered CPS generation (add_rent → QRF imputation) silently got a MagicMock whose __len__ returns 0, causing the "0 input values" ValueError. Fix: use try/import instead of checking sys.modules, so the mock is only installed when microimpute genuinely cannot be imported. Also restores policyengine-us to 1.587.0 (the revert in 2566ac1 was a misdiagnosis — the conftest mock was the real root cause). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Move retirement contribution limits to imputation_parameters.yaml (single source of truth for cps.py and puf_impute.py) - Add exc_info=True to QRF exception handlers for full tracebacks - Replace TestLimitsMatchCps with TestLimitsMatchYaml that actually cross-checks against the shared YAML source - Remove redundant microimpute mock from test_calibration/conftest.py (root conftest already propagates) - Document QRF training subsample of 5000 rationale - Import targets/limits in validation script from loss.py and YAML instead of hardcoding Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Previously, filers with SE income had ALL of RETCB_VAL allocated to self_employed_pension_contributions, leaving $0 for 401(k)/IRA even when they also had wages. SE-only filers also got $0 in IRA despite being eligible. Changes: - Cap SE pension at min(25% of SE income, IRS dollar limit) so dual-income filers retain a remainder for 401(k) and IRA - Gate IRA pool on has_earned_income (wages OR SE) instead of has_wages only, so SE-only filers can receive IRA contributions - IRA pool = remaining - dc_pool (absorbs all non-DC remainder) - Add SE pension rate and dollar limits to imputation_parameters.yaml (2020-2025, sourced from IRS one-participant 401k rules) - Apply matching SE pension cap in PUF QRF constraints - Add test_se_pension_capped_at_rate_times_income No change to calibration targets (independently sourced from BEA/FRED and IRS SOI). Pre-calibration aggregates should shift closer to targets since dual-income filers now contribute to all account types. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Replace _get_retirement_limits() YAML loader with wrapper that merges policyengine-us params (401k/IRA) + YAML SE pension params - Remove retirement_contribution_limits block from YAML (now sourced from policyengine-us parameter tree via utils/retirement_limits.py) - Update validation script to use get_retirement_limits() directly - Update tests for new structure (no more year clamping for 401k/IRA) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
942de14 to
a1446c4
Compare
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
@baogorek Thanks for the thorough review — all 7 points have been addressed:
CI: all quick checks pass (lint, lock freshness, smoke test, changelog, fork check). Full Modal build/test still running. |
MaxGhenis
left a comment
There was a problem hiding this comment.
Thorough and well-sourced PR. Key improvements:
- Proportional RETCB_VAL split (replacing waterfall that starved IRA contributions) using BEA/FRED + SOI admin data shares
- QRF retirement imputation for PUF half with proper IRS limit capping and income eligibility gating
- SS sub-component reconciliation (QRF + age heuristic fallback) ensuring subs sum to imputed total
- Formula variable stripping from stored dataset to prevent stale values misleading validation
- Calibration targets corrected with proper SOI/BEA sources (traditional IRA $25B -> $13.2B, 401(k) split added)
- Comprehensive tests (1200+ lines) covering constants, limits, QRF mock behavior, routing, SS reconciliation, and dataset integrity
Already uses the shared get_retirement_limits utility from #566.
Summary
Calibrates retirement contribution variables in the CPS/Extended CPS pipeline with three major changes:
imputation_parameters.yaml(dc_share=0.908, ira_share=0.092, roth_dc_share=0.15, trad_ira_share=0.392)Additional changes
cps.py/puf_impute.pyto single source inimputation_parameters.yaml_qrf_ss_sharesand_impute_retirement_contributionsnow log full tracebacks on failuretest_calibration/conftest.pymock removed (root conftest propagates)loss.pyand limits from YAML instead of hardcodingTest plan
test_retirement_imputation.py)test_no_formula_variables_stored.py)test_cps_has_auto_loan_interest/test_cps_has_net_worthfailures)validation/validate_retirement_imputation.py) confirms aggregates match targets🤖 Generated with Claude Code